Double Chain-Star: an RDF indexing scheme for fast processing of SPARQL joins

Authors

  • Marios Meimaris
  • George Papastefanatos
Abstract

State-of-the-art RDF stores often rely on exhaustive indexing and sequential (self-)joins for SPARQL query processing. However, query execution depends on, and is often limited by, the underlying storage and indexing schemes. Even though RDF can give rise to datasets with loosely defined schemas, an emergent structure is commonly present in the data. In this paper we introduce a novel indexing scheme, called Double Chain Star (DCS), that takes advantage of the inherent structure often found in RDF datasets by extending the notion of Characteristic Sets to cater for chain-star joins. DCS essentially reduces pairs of chain-star patterns, which typically involve multiple self-joins, to mere index scans. We perform preliminary experiments and show promising results in comparison with Jena TDB and RDF-3X.
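
The abstract does not spell out how the index is built, so the following is only a minimal sketch of the general idea it builds on, not the DCS implementation. It assumes the usual definition of a characteristic set (the set of predicates that occur with a given subject) and groups the triples that link two subjects by the pair of their characteristic sets, so that a chain between two star patterns becomes a dictionary lookup. The toy data and the function names `characteristic_sets` and `linked_cs_pairs` are hypothetical.

```python
from collections import defaultdict

# Toy triples (subject, predicate, object); every identifier here is made up.
triples = [
    ("ex:alice", "ex:name", '"Alice"'),
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:name", '"Bob"'),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:acme", "ex:label", '"ACME"'),
    ("ex:acme", "ex:locatedIn", "ex:athens"),
]


def characteristic_sets(triples):
    """Map each subject to the set of predicates it appears with."""
    preds = defaultdict(set)
    for s, p, _ in triples:
        preds[s].add(p)
    return {s: frozenset(ps) for s, ps in preds.items()}


def linked_cs_pairs(triples, cs_of):
    """Group triples whose object is itself a subject, keyed by the pair
    (characteristic set of the subject, characteristic set of the object).
    This is an illustrative assumption, not the paper's index layout."""
    index = defaultdict(list)
    for s, p, o in triples:
        if o in cs_of:  # the object is the centre of another star
            index[(cs_of[s], cs_of[o])].append((s, p, o))
    return dict(index)


cs_of = characteristic_sets(triples)
pairs = linked_cs_pairs(triples, cs_of)

# Look up all person -> organisation links without joining triple by triple.
person_cs = frozenset({"ex:name", "ex:worksFor"})
org_cs = frozenset({"ex:label", "ex:locatedIn"})
links = pairs[(person_cs, org_cs)]
```

In this sketch, a query asking for the name of a person and the label of the organisation they work for touches a single entry of `pairs` instead of joining four triple patterns one at a time.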

Similar resources

H2RDF+: High-performance distributed joins over large-scale RDF graphs

The proliferation of data in RDF format calls for efficient and scalable solutions for their management. While scalability in the era of big data is a hard requirement, modern systems fail to adapt based on the complexity of the query. Current approaches do not scale well when faced with substantially complex, non-selective joins, resulting in exponential growth of execution times. In this work...

RDFMatView: Indexing RDF Data using Materialized SPARQL queries

The Semantic Web aims to create a universal medium for the exchange of semantically tagged data. The idea of representing and querying this information by means of directed labelled graphs, i.e., RDF and SPARQL, has been widely accepted by the scientific community. However, even when most current implementations of RDF/SPARQL are based on ad-hoc storage systems, processing complex queries on la...

RDFMatView: Indexing RDF Data for SPARQL Queries

The Semantic Web is now gaining momentum due to its efforts to create a universal medium for the exchange of semantically tagged data. The representation and querying of semantic data have been made by means of directed labelled graphs using RDF and SPARQL, standards which have been widely accepted by the scientific community. Currently, most implementations of RDF/SPARQL are based on relationa...

Distributed Processing of Generalized Graph-Pattern Queries in SPARQL 1.1

We propose an efficient and scalable architecture for processing generalized graph-pattern queries as they are specified by the current W3C recommendation of the SPARQL 1.1 “Query Language” component. Specifically, the class of queries we consider consists of sets of SPARQL triple patterns with labeled property paths. From a relational perspective, this class resolves to conjunctive queries of ...

Cascading map-side joins over HBase for scalable join processing

One of the major challenges in large-scale data processing with MapReduce is the smart computation of joins. Since Semantic Web datasets published in RDF have increased rapidly over the last few years, scalable join techniques become an important issue for SPARQL query processing as well. In this paper, we introduce the Map-Side Index Nested Loop Join (MAPSIN join) which combines scalable index...

Publication date: 2016